Detailed Analysis of different Strategies for Phrase Table Adaptation in SMT
Abstract
This paper gives a detailed analysis of different approaches to adapting a statistical machine translation system towards a target domain using small amounts of parallel in-domain data. We investigate the differences between approaches that address adaptation at the two main steps of building a translation model: candidate selection and phrase scoring. For the latter step, we characterize the differences by four key aspects. We performed experiments on two different speech translation tasks and analyzed the influence of the different aspects on overall translation quality. On both tasks, we could show significant improvements by using the presented adaptation techniques.
Similar resources
Literature Survey: Study of Reordering in Pivot Based SMT
Pivot-based SMT addresses the scarcity of source-target parallel corpora by introducing a third, resource-rich ‘pivot’ language. The triangulation method in pivot-based SMT uses the pivot language to induce new phrase pairs into the phrase table; this process is known as ‘phrase table triangulation’. Phrase table triangulation has been extensively studied by many researchers....
A Comparison of Pivot Methods for Phrase-Based Statistical Machine Translation
We compare two pivot strategies for phrase-based statistical machine translation (SMT), namely phrase translation and sentence translation. The phrase translation strategy means that we directly construct a phrase translation table (phrase-table) of the source and target language pair from two phrase-tables; one constructed from the source language and English and one constructed from English a...
Dynamically Integrating Cross-Domain Translation Memory into Phrase-Based Machine Translation during Decoding
Our previous work focuses on combining translation memory (TM) and statistical machine translation (SMT) when the TM database and the SMT training set are the same. However, in real tasks the TM database deviates from the SMT training set over time. In this work, we concentrate on the case where the TM database and the SMT training set are different and even from different domains...
Connecting Phrase based Statistical Machine Translation Adaptation
Although more additional corpora are now available for statistical machine translation (SMT), only those from the same or a similar domain as the original corpus can directly enhance SMT performance. Most existing adaptation methods focus on sentence selection. In comparison, the phrase is a smaller and more fine-grained unit for data selection; therefore we propose a s...
Dynamic Models in Moses for Online Adaptation
A very hot issue for research and industry is how to effectively integrate machine translation (MT) within computer-assisted translation (CAT) software. This paper focuses on this issue, and more generally how to dynamically adapt phrase-based statistical machine translation (SMT) by exploiting external knowledge, like the post-editions from professional translators. We present an enhancement of t...